Abstract. In [IMR01] we advocated the investigation of robustness of 
results in the theory of learning in games under adversarial scheduling 
models. We provide evidence that such an analysis is feasible and can lead 
to nontrivial results by investigating, in an adversarial scheduling setting, 
Peyton Young's model of diffusion of norms [You98]. In particular, our 
main result incorporates contagion into Peyton Young's model. 
1 Introduction 

Game-theoretic equilibria are steady-state properties; that is, given that all the 
players' actions correspond to an equilibrium point it would be irrational for any 
of them to deviate from this behavior when the others stick to their strategy. 
The fundamental problem facing this type of concept is that it does not predict 
how players arrive at this equilibrium in the first place, or how they "choose" 
one such equilibrium, if several such points exist. The theory of equilibrium 
selection of Harsanyi and Selten [HS88] assumes some form of prior coordination 
between players, in the form of a tracing procedure. This strong prerequisite is 
often unrealistic. 

The theory of learning in games [FL99] attempts to explain the emergence of 
equilibria as the result of an evolutionary "learning" process. Models of this type 
assume one (or several) populations of agents, that interact by playing a certain 
game, and updating their behavior based on the outcome of this interaction. 

Results in evolutionary game theory are important not necessarily as realis- 
tic models of strategic behavior. Rather, they provide possible explanations for 
experimentally observed features of real-world social dynamics. For instance, the 
fundamental insight behind the concept of stochastically stable strategies is that 
continuous "noise" (or small deviations from rationality) can provide a solution 
to the equilibrium selection problem in game theory (for a discussion on the role of 
strategic learning in equilibrium selection see [PY05]). Similar issues apply when 
mathematical modeling is replaced with computer experiments, in the area of 
agent-based social simulation [GT05]. Epstein [Eps07] (see also [AE96]) has ad- 
vocated a generative approach to social science: in order to better understand a 
given phenomenon one should be able to generate it via simulations. 

Given that such mathematical models or simulations are emerging as tools 
for policy-making (see e.g. [NBB99, ECC+04]), how can we be sure that the 
conclusions that we derive from the output of the simulation do not crucially 
depend on the particular assumptions and features we embed in it? Part of 
the answer is that these results have to display "robustness" with respect to the 
various idealizations inherent in the model, be it mathematical or computational. 

Various issues that might impact the robustness of the conclusions have been 
previously considered in the game-theoretic literature; for instance, the cele- 
brated result of Foster and Young [FY90] can be viewed as investigating the 
robustness of Nash equilibria with respect to the introduction of small amounts 
of random noise (or player mistakes). 

In this paper we are only concerned with one such issue: scheduling, i.e. the 
order in which agents get to update their strategies. Two alternatives are most 
popular, both in the mathematical and the computer simulation literature: in the 
synchronous mode, every player updates at every step. A popular alternative is 
uniform matching. Models of the latter type assume an underlying (hyper)graph 
topology (describing the sets of players allowed to simultaneously update in 
one step as a result of game playing) and choose a (hyper)edge uniformly at 
random from the available ones. Employing a uniform matching model in multiagent 
models of social systems is unrealistic, for it assumes perfect and global 
randomness; it is not clear whether this assumption is warranted in the "real 
life" situations that the theory is supposed to model. Indeed, notwithstanding 
the question regarding the existence of computational randomness in nature, the 
structure of social interactions is neither random, nor uniform, and comprises 
many regular, "day by day" interactions, as well as a smaller number of "oc- 
casional" ones. A random matching model does not take into account locality 
and cannot, therefore, adequately model "contagion" effects (i.e. players becoming 
activated as a result of some of their neighbors doing so). On the other 
hand, social systems are inherently distributed, and it is not clear whether the 
assumption of global randomness is reasonable in a simulation setting. 

We investigate in an adversarial setting Peyton Young's model of evolution 
of norms [You98] (see also [You03]). The dynamics models an important aspect 
of social networks, the emergence of conventions, and has been proposed as an 
evolutionary justification for the emergence of certain rules in the pragmatics 
of natural language [Roo04]. Our results can be summarized as follows: results 
on selection of strict-dominant equilibria under random noise extend (Theo- 
rem 9) to a class of nonadaptive schedulers. However, such an extension fails 
for adaptive schedulers, even those with fairness properties similar to those of a 
random scheduler. Our main result (Theorem 10) extends the convergence to the 
strictly-dominant equilibrium to a class of "nonmalicious" adaptive schedulers 
that model contagion and have a certain reversibility property (the class of such 
schedulers includes the random scheduler as a special case). However, for this 
class of schedulers the convergence time is not necessarily the one from the case 
of random scheduling. 

Besides the relevance of our results to evolutionary game-theory, we hope that 
the concepts and techniques relevant to this paper can be fruitfully exploited in 
the theory of rapidly mixing Markov chains, of great interest in Theoretical 
Computer Science. 

Not surprisingly, our framework is related to the theory of self-stabilization 
of distributed systems [Dol00]. Our proofs highlight some principles and techniques 
of this theory (the existence of a winning strategy for scheduler-luck games 
[Dol00], monotonicity and composition of winning strategies) that can be applied 
to the particular problem we study, and conceivably in more general settings as 
well. 

2 Preliminaries 

A general class of models for which adversarial analysis can be naturally considered 
is that of population games [Blu01]. Systems of interest in this class consist 
of a number of agents, defined as the vertices of a hypergraph H = (V, E). One 
edge of this hypergraph represents a particular choice of all players who can play 
(one or more simultaneous instances of) a game G that defines the dynamics. 
Each player has a state (generally a mixed strategy of G) chosen from a certain 
set S. The global state of the system is an element of $\mathbf{S} = S^V$. The dynamics 
proceeds by choosing one edge e of H (according to a scheduling mechanism 
that is specified by the scheduler), letting the agents in e play the game, and 
updating their states as a result of game playing. 

2.1 Schedulers 

Denote by X* the set of finite words over alphabet X. 

Definition 1. A deterministic scheduler is specified by a mapping $f : E^* \times S \to E$, 
where E is the set of edges of H and S is the set of possible states of the 
system. Mapping f specifies the next scheduled element, given the current history. 
Let b > 1. A scheduler that can choose one item among a set of m elements is 
(worst-case) b-fair if every agent is guaranteed to be scheduled at least once in 
any sequence of $b(m-1)+1$ consecutive steps. 
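
As an illustration of the worst-case fairness condition, the following minimal sketch checks whether a finite schedule satisfies it. The function name `is_b_fair` and the two example schedules are assumptions made for this illustration only; the check is applied to a finite prefix, whereas the definition speaks about every sequence of consecutive steps.

```python
def is_b_fair(schedule, elements, b):
    """Check the worst-case b-fairness condition on a finite schedule.

    Every element must occur in every window of b*(m-1)+1 consecutive
    steps, where m is the number of schedulable elements.
    """
    m = len(elements)
    window = b * (m - 1) + 1
    if len(schedule) < window:
        raise ValueError("schedule is shorter than one window")
    required = set(elements)
    for start in range(len(schedule) - window + 1):
        # A window that misses some element violates fairness.
        if not required <= set(schedule[start:start + window]):
            return False
    return True


# Hypothetical schedules over three elements (m = 3).
round_robin = [0, 1, 2] * 4        # fixed permutation, repeated
skewed = [0, 0, 0, 0, 1, 2] * 2    # element 0 is heavily favoured

print(is_b_fair(round_robin, [0, 1, 2], b=2))
print(is_b_fair(skewed, [0, 1, 2], b=2))
print(is_b_fair(skewed, [0, 1, 2], b=3))
```

The round-robin schedule passes for b = 2, while the skewed schedule only passes once the window is enlarged to b = 3.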

One particularly restricted class of schedulers is that of non-adaptive sched- 
ulers, corresponding to updates of the nodes/edges according to a fixed permu- 
tation, independent of the initial state of the system. 

The above definitions are well-suited for deterministic schedulers. They are 
not well-suited for probabilistic schedulers (such as random matching) , since for 
any fixed number of steps B with positive probability it will take more than B 
steps to schedule each element at least once. Also, the definitions do not allow 
for multiple agents to be scheduled simultaneously. Therefore in this case we 
need to employ slightly different definitions. 

Definition 2. A (probabilistic) scheduler assigns a probability distribution $p_{w,s,s_0}$ 
on E to each triple $(w, s, s_0)$ consisting of initial prefixes $w \in E^*$, $s \in S^*$ with 
$|w| = |s|$ and starting state $s_0 \in S$. The next element $e \in E$ to be scheduled, 
given prefixes w, s and initial state $s_0$, is sampled from E according to $p_{w,s,s_0}$. 

A non-adaptive probabilistic scheduler is specified by (a) a collection (multiset) 
$\mathcal{S} = \{\mathcal{D}_1, \ldots, \mathcal{D}_m\}$ of probability distributions on the set E such that every 
$x \in E$ belongs to the support of some distribution $\mathcal{D}_i$ and (b) a fixed permutation 
$\pi$ of $\mathcal{S}$. The scheduler proceeds by (possibly concurrently) scheduling elements 
of E sampled from a distribution from $\mathcal{S}$ chosen according to (consecutive repetitions 
of) permutation $\pi$. For C > 0, a non-adaptive probabilistic scheduler is 
C-individually fair if for every $x \in E$, the probability that x is scheduled during 
one round of $\pi$ is at least $C/|E|$. 

One can define, for any given triple $(w, s, s_0)$, where $w \in E^*$, $s \in S^*$ and 
$s_0 \in S$, a probability $\pi_{w,s,s_0}$, the probability that, starting from state $s_0$, the 
scheduler uses w as the initial prefix of its schedule and evolves its global state 
according to string s. Let $\Omega$ denote the resulting probability space. We divide 
each trajectory of a probabilistic scheduler into rounds: the first round is the 
smallest initial segment that schedules each element of E at least once, the 
second round is the smallest segment starting at the end of the first round that 
schedules each element at least once, and so on. Given this convention, it is easy 
to see that for any k > 0 and $s \in S$ the family $W_k$ of strings w consisting of 
exactly k rounds realizes a complete partition of the probability space $\Omega$, i.e. 
$\sum_{w \in W_k} \pi_{w,s} = 1$. 

Definition 3. If $f(\cdot)$ is a function on integers, we say that a family of probabilistic 
schedulers, indexed by n, the cardinality of the set E, is $O(f(n))$-fair 
w.h.p. if there exists a monotonically decreasing function $g : (0, \infty) \to (0, 1)$, 
with $\lim_{\epsilon \to \infty} g(\epsilon) = 0$, such that for every state $s \in S$, denoting by $l_i$ the random 
variable measuring the length of the i'th round, we have 
$\lim_{n \to \infty} \mathrm{Prob}[l_i > \epsilon \cdot f(n)] < g(\epsilon)$. 

Random scheduling can be specified by a non-adaptive probabilistic scheduler 
whose set $\mathcal{S}$ consists of just one distribution, namely the uniform distribution on 
E. This scheduler is 1-individually fair and, by the well-known Coupon Collector 
Lemma, it is also $O(n \log(n))$-fair w.h.p. 
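
The round structure and the coupon-collector behaviour of the uniform scheduler can be explored numerically. The sketch below is only an illustration: the helper names `round_lengths` and `expected_round_length`, the seed, and the sizes are assumptions chosen for the example. It splits a uniformly random schedule into rounds and compares the empirical mean round length with the coupon-collector expectation $n H_n$.

```python
import random


def round_lengths(schedule, n):
    """Split a schedule over elements 0..n-1 into rounds.

    A round is the shortest segment (starting where the previous round
    ended) that schedules every element at least once. Only completed
    rounds are reported.
    """
    lengths, seen, start = [], set(), 0
    for t, element in enumerate(schedule):
        seen.add(element)
        if len(seen) == n:
            lengths.append(t + 1 - start)
            seen, start = set(), t + 1
    return lengths


def expected_round_length(n):
    """Coupon-collector expectation n * H_n for uniform scheduling."""
    return n * sum(1.0 / k for k in range(1, n + 1))


rng = random.Random(0)   # fixed seed, for reproducibility of the sketch
n = 20
schedule = [rng.randrange(n) for _ in range(20000)]
lengths = round_lengths(schedule, n)
mean_length = sum(lengths) / len(lengths)
print(mean_length, expected_round_length(n))
```

Since $H_n$ grows like $\log n$, the expected round length is of order $n \log n$, in line with the fairness bound quoted above.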

2.2 Peyton Young's model of norm diffusion 

The setup of this model is the following: agents located at the vertices of a 
graph G interact by playing a two-person symmetric game with payoff matrix 
$M = (m_{i,j})_{i,j \in \{A,B\}}$ displayed in Figure 1. It is assumed that strategy A is a 
so-called strict risk-dominant equilibrium. That is, we have $a - d > b - c > 0$. 
Each undirected edge {i, j} has a positive weight $w_{ij} = w_{ji}$ that measures its 
"importance". When scheduled, agents play (using the same strategy, identified 
as the agent's state) against each of their neighbors. If agent i is the one to 
update, x is the joint profile of agents' strategies, and $z \in \{A, B\}$ is the candidate 
new state, then $p^\beta(x_i \to z \mid x) \propto e^{\beta \cdot \nu_i(z, x_{-i})}$, where $\nu_i(z, x_{-i})$, the payoff of the i'th 
agent should he play strategy z while the others' profile remains the same, is 
given by $\nu_i(z, x_{-i}) = \sum_{(i,j) \in E} w_{ij} m_{z, x_j}$. Under random scheduling, the process 
we defined is a variant of the best-response dynamics. This latter process (viewed 
as a Markov chain) is not ergodic. Indeed, since in game G it is always better 
to play the same strategy as your partner, the dynamics has at least two fixed 
points, states "all A" and "all B". 
Fig. 1. Payoff matrix 
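
The log-linear update rule described above can be sketched in a few lines. Everything named here is an assumption for illustration: the function `logit_update_probs`, the three-node path graph, and the payoff values (a pure coordination game with a = 3, b = 2 and zero off-diagonal payoffs, which satisfies a - d > b - c > 0).

```python
import math


def logit_update_probs(i, x, weights, payoff, beta):
    """Probabilities with which agent i adopts A or B under the
    log-linear rule, given the current profile x.

    x: dict node -> 'A' or 'B'
    weights: dict frozenset({i, j}) -> edge weight w_ij
    payoff: dict (own strategy, neighbour strategy) -> matrix entry
    """
    def nu(z):
        # Weighted payoff of playing z against every neighbour of i.
        return sum(w * payoff[(z, x[j])]
                   for edge, w in weights.items() if i in edge
                   for j in edge if j != i)

    scores = {z: math.exp(beta * nu(z)) for z in ("A", "B")}
    total = sum(scores.values())
    return {z: s / total for z, s in scores.items()}


# Hypothetical instance: path 0 - 1 - 2 with unit weights.
PATH = {frozenset({0, 1}): 1.0, frozenset({1, 2}): 1.0}
COORDINATION = {("A", "A"): 3.0, ("A", "B"): 0.0,
                ("B", "A"): 0.0, ("B", "B"): 2.0}
profile = {0: "B", 1: "B", 2: "B"}
probs = logit_update_probs(1, profile, PATH, COORDINATION, beta=1.0)
print(probs)
```

With both neighbours playing B, the middle agent keeps B with high probability; the small remaining probability of switching to A is the "noise" that the stochastic stability analysis relies on.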



An important property of Peyton Young's dynamics is that it corresponds 
to a potential game: there exists a function $\rho : S^V \to \mathbb{R}$ such that, for any player 
i, any possible actions $a_1, a_2$ of player i, and any action profile a of the other 
players, $u_i(a_1, a) - u_i(a_2, a) = \rho(a_1, a) - \rho(a_2, a)$ (where $u_i$ is the utility function 
of player i). In other words changes in utility as a result of strategy update 
correspond to changes in a global potential function. An explicit potential is 
given by $\rho^*(x) = \sum_{(h,k) \in E} w_{h,k} m_{x_h, x_k}$. 
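
The potential-game identity can be checked by brute force on a small instance. The sketch below is an illustration under explicit assumptions: a weighted triangle and a pure coordination payoff matrix (symmetric, with zero off-diagonal entries), both chosen for this example; the names `utility`, `potential` and `max_potential_gap` are hypothetical.

```python
from itertools import product

# Hypothetical weighted triangle and coordination payoffs.
EDGES = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 0.5}
PAYOFF = {("A", "A"): 3.0, ("A", "B"): 0.0,
          ("B", "A"): 0.0, ("B", "B"): 2.0}


def utility(i, x):
    """Weighted payoff of player i against all of its neighbours."""
    return sum(w * PAYOFF[(x[i], x[k if h == i else h])]
               for (h, k), w in EDGES.items() if i in (h, k))


def potential(x):
    """Sum over edges of w_hk * m_{x_h, x_k}."""
    return sum(w * PAYOFF[(x[h], x[k])] for (h, k), w in EDGES.items())


def max_potential_gap():
    """Largest mismatch between utility change and potential change
    over all profiles and all unilateral deviations."""
    gap = 0.0
    for x in product("AB", repeat=3):
        for i in range(3):
            for z in "AB":
                y = x[:i] + (z,) + x[i + 1:]
                du = utility(i, y) - utility(i, x)
                dp = potential(y) - potential(x)
                gap = max(gap, abs(du - dp))
    return gap


print(max_potential_gap())
```

For this instance the gap is zero, so every unilateral change in utility is matched exactly by the change in the edge-sum potential.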

2.3 Stochastic stability 

A fundamental concept we are dealing with is that of a stochastically stable state 
for dynamics described by a Markov chain. 

Definition 4. Consider a Markov process $P^0$ defined on a finite state space $\Omega$. 
For each $\epsilon > 0$, define a Markov process $P^\epsilon$ on $\Omega$. $P^\epsilon$ is a regular perturbed 
Markov process if all of the following conditions hold. 

— $P^\epsilon$ is irreducible for every $\epsilon > 0$. 

— For every $x, y \in \Omega$, $\lim_{\epsilon \to 0} P^\epsilon_{xy} = P^0_{xy}$. 

— If $P^\epsilon_{xy} > 0$ then there exists $r(m) \geq 0$, the resistance of transition $m = (x \to y)$, 
such that as $\epsilon \to 0$, $P^\epsilon_{xy} = \Theta(\epsilon^{r(m)})$. 

Let $\mu^\epsilon$ be the (unique) stationary distribution of $P^\epsilon$. A state s is stochastically 
stable if $\lim_{\epsilon \to 0} \mu^\epsilon(s) > 0$. 

Peyton Young's model of diffusion of norms can be easily recast into the 
framework of Definition 4, by defining $\epsilon = \exp(-\beta)$. 



Observation 1 Peyton Young's model of diffusion of norms can be recast into 
the framework of Definition 4. Let $\epsilon = \exp(-\beta)$. As $\beta \to \infty$, $\epsilon \to 0$. Consider 
now the Markov chain $\Gamma^\epsilon$ corresponding to the original dynamics. It has transition 
matrix $D^\epsilon = D_{1,\epsilon} \cdots D_{m,\epsilon}$, where $D_{i,\epsilon} = (d^{i,\epsilon}_{k,l})$ is the transition matrix 
corresponding to scheduling (and updating) a node according to $\mathcal{D}_i$. It is easy to 
see that $\lim_{\epsilon \to 0} d^{i,\epsilon}_{k,l} = d^{i}_{k,l}$. Moreover, by the nature of the dynamics, as $\epsilon \to 0$ 
each element of $D_{i,\epsilon}$ is either zero (in case the state transition $k \to l$ cannot 
be realized by updating any single node member of $\mathrm{supp}(\mathcal{D}_i)$), tends to a positive 
constant which is the probability that the node corresponding to the transition 
$k \to l$ is chosen (in case the transition $k \to l$ corresponds to a "best reply" move), 
or tends to zero, asymptotically like $\Theta(\epsilon^{r_{i,k,l}})$, for some $r_{i,k,l} > 0$ (otherwise). 

Definition 5. A tree rooted at node j is a set T of edges such that for any 
state $w \neq j$ there exists a unique (directed) path from w to j. The resistance of 
a rooted tree T is the sum of resistances of all edges in T. 

The following characterization of stochastically stable states is presented as 
Lemma 3.2 in the Appendix of [You98]: 

Proposition 6. Let $P^\epsilon$ be a regular perturbed Markov process, and for each 
$\epsilon > 0$ let $\mu^\epsilon$ be the unique stationary distribution of $P^\epsilon$. Then $\lim_{\epsilon \to 0} \mu^\epsilon = \mu^0$ 
exists, and $\mu^0$ is a stationary distribution of $P^0$. The stochastically stable states 
are precisely those states z such that there exists a tree rooted at z of minimal 
resistance (among all rooted trees). 

Definition 7. Given a graph G, a nonempty subset S of vertices and a real number 
$0 < r < 1/2$ we say that S is r-close-knit if $\forall S' \subseteq S$, $S' \neq \emptyset$, $\frac{e(S', S)}{\sum_{i \in S'} \deg(i)} > r$, 
where $e(S', S)$ is the number of edges with one endpoint in $S'$ and the other in 
S, and deg(i) is the degree of vertex i. A graph G is (r, k)-close-knit if every 
vertex is part of an r-close-knit set S, with $|S| = k$. 
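
For small sets the close-knit condition can be tested by enumerating all nonempty subsets. The following sketch is an illustration only: the names `closeknit_ratio` and `is_close_knit` and the example graph are assumptions, the enumeration is exponential in |S|, and vertices are assumed to have positive degree.

```python
from itertools import combinations


def closeknit_ratio(adj, S):
    """Minimum over nonempty S' subset of S of
    e(S', S) / (sum of degrees of the vertices in S').

    adj: dict vertex -> list of neighbours (undirected graph).
    """
    S = list(S)
    best = float("inf")
    for size in range(1, len(S) + 1):
        for sub in combinations(S, size):
            # Edges with one endpoint in S' and the other in S,
            # each counted once.
            edges = {frozenset((u, v))
                     for u in sub for v in adj[u] if v in S}
            degree_sum = sum(len(adj[u]) for u in sub)
            best = min(best, len(edges) / degree_sum)
    return best


def is_close_knit(adj, S, r):
    """Check the strict inequality of the definition for every S'."""
    return closeknit_ratio(adj, S) > r


# Hypothetical graph: triangle 0-1-2 with a pendant vertex 3 on 2.
ADJ = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
ratio = closeknit_ratio(ADJ, {0, 1, 2})
print(ratio, is_close_knit(ADJ, {0, 1, 2}, 0.4))
```

Here the binding subset is the whole triangle, with 3 internal edges against a degree sum of 7.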

Definition 8. Given $p \in [0, 1]$, the p-inertia of the process is the maximum, 
over all states $x_0 \in S$, of $W(\beta, p, x_0)$, the expected waiting time until at least 
$1 - p$ of the population is playing action A conditional on starting in state $x_0$. 

The model in [You03,You98] assumes independent individual updates, ar- 
riving at random times governed (for each agent) by a Poisson arrival process 
with rate one. Since we are, however, interested in adversarial models that do 
not have an easy description in continuous time we will assume that the process 
proceeds in discrete steps. At each such step a random node is scheduled. It is a 
simple exercise to translate the result in [You03,You98] to an equivalent one for 
global, discrete-time scheduling. The conclusions of this translation are: 

— The stationary distribution of the process is the Gibbs distribution, 
$\mu_\beta(x) = \frac{e^{\beta \rho(x)}}{\sum_z e^{\beta \rho(z)}}$, where $\rho$ is the potential function of the dynamics. 

— "All A" is the unique stochastically-stable state of the dynamics. 

— Let $r^* = \frac{b - c}{a - d + b - c}$, and let $r > r^*$, $k > 0$. On a family of (r, k)-close-knit 
graphs the convergence time is O(n). 
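
The first two conclusions can be illustrated numerically on a toy instance: as $\beta$ grows, the Gibbs distribution concentrates on "all A". The sketch below makes several assumptions for the sake of the example: the function name `gibbs_distribution`, a triangle with unit weights, and a coordination game with a = 3, b = 2 and zero off-diagonal payoffs.

```python
import math
from itertools import product


def gibbs_distribution(edges, payoff, n, beta):
    """Gibbs distribution on {A,B}^n for the edge-sum potential."""
    states = list(product("AB", repeat=n))
    pot = {x: sum(w * payoff[(x[h], x[k])] for (h, k), w in edges.items())
           for x in states}
    top = max(pot.values())  # shift by the maximum to avoid overflow
    weight = {x: math.exp(beta * (pot[x] - top)) for x in states}
    total = sum(weight.values())
    return {x: wt / total for x, wt in weight.items()}


# Hypothetical instance.
TRIANGLE = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
COORDINATION = {("A", "A"): 3.0, ("A", "B"): 0.0,
                ("B", "A"): 0.0, ("B", "B"): 2.0}
ALL_A = ("A", "A", "A")

mass_on_all_a = [
    gibbs_distribution(TRIANGLE, COORDINATION, 3, beta)[ALL_A]
    for beta in (0.0, 0.5, 1.0, 2.0)
]
print(mass_on_all_a)
```

At $\beta = 0$ all eight states are equally likely; the mass on "all A" then increases with $\beta$ and approaches one, which is the finite-instance counterpart of stochastic stability.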



3 Results 



First we note that Peyton Young's results easily extend to non-adaptive sched- 
ulers. Adaptive schedulers on the other hand, even those of fairness no higher 
than that of the random scheduler, can preclude the system from ever enter- 
ing a state where a proportion higher than r of agents plays the risk-dominant 
strategy: 

Theorem 9. The following hold: 

(i) For all non-adaptive schedulers, the state "all A" is the unique stochas- 
tically stable state of the system. 

(ii) Let $\mathcal{G}$ be a class of graphs that are (r, k)-close-knit for some fixed $r > r^*$. 
Let f = f(n) be a class of non-adaptive O(1) individually fair schedulers. Given 
any $p \in (0, 1)$ there exists a $\beta_p$ such that for all $\beta > \beta_p$ there exists a constant 
C such that the p-inertia of the process (under scheduling given by f) is at most 
$C \cdot m \cdot n$, where m = m(n) is the number of rounds of f and n is the number of 
vertices of the underlying graph. 

(iii) For every $0 < r < 1$ there exists an adaptive scheduler which is $O(n \log(n))$-fair 
w.h.p. (where the constant hidden in the "O" notation depends on r) that 
can forever prevent the system, started on the "all B" configuration, from ever 
having more than a fraction of r of the agents playing A. 

Proof. (i) Let $i \in V$ and let $M_i$ be the restriction of the given dynamics corresponding 
to the case when only one node, node i, is scheduled in all moves (otherwise 
the dynamics is similar to the original one). 

It is easy to see that $M_i$ is a non-ergodic Markov chain and that $\mu_\beta$ is 
a stationary distribution for the Markov chain $M_i$. This is so because for 
two configurations x, y that only differ in position i, the ratio of transition 
probabilities $p_{x,y} / p_{y,x}$ (where $p_{x,y}$ is the probability of moving from x to y) 
is equal to $\exp[\beta \cdot (\rho^*(y) - \rho^*(x))]$, which is precisely $\mu_\beta(y) / \mu_\beta(x)$. 

Now consider the matrix $D_k$ corresponding to the distribution $\mathcal{D}_k$, with the same 
notation as in the periodic schedule. It is a convex combination of the matrices 
$M_i$, hence it will also have $\mu_\beta$ as a stationary distribution. We infer 
that the product of matrices $D_k$ corresponding to the cyclic schedule also 
has $\mu_\beta$ as a stationary distribution. 

But it is easy to see that the Markov chain corresponding to one round of 
the cyclic schedule is irreducible (since one can navigate between any two 
states in at most $|V|$ rounds, by flipping the differing bits and keeping the 
bits that coincide fixed) and aperiodic (since the probability of remaining 
in a given state is positive). Therefore, it must have a unique stationary 
distribution, which is necessarily $\mu_\beta$. 
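
The invariance argument can be checked numerically on a tiny instance. The sketch below is a sanity check under stated assumptions, not part of the proof: a three-node path, a coordination payoff matrix and a value of $\beta$ chosen for the example, and hypothetical helper names (`step_matrix`, `push`, `after_cyclic_round`). Because the game is an exact potential game here, the single-node update is written in terms of the potential instead of the individual payoff, which gives the same probabilities.

```python
import math
from itertools import product

# Hypothetical instance: path 0 - 1 - 2, unit weights.
EDGES = {(0, 1): 1.0, (1, 2): 1.0}
PAYOFF = {("A", "A"): 3.0, ("A", "B"): 0.0,
          ("B", "A"): 0.0, ("B", "B"): 2.0}
N, BETA = 3, 0.7
STATES = list(product("AB", repeat=N))


def potential(x):
    return sum(w * PAYOFF[(x[h], x[k])] for (h, k), w in EDGES.items())


def step_matrix(i):
    """Transition probabilities when only node i is scheduled."""
    M = {}
    for x in STATES:
        options = [x[:i] + (z,) + x[i + 1:] for z in "AB"]
        scores = [math.exp(BETA * potential(y)) for y in options]
        total = sum(scores)
        M[x] = {y: s / total for y, s in zip(options, scores)}
    return M


def push(mu, M):
    """One step of the chain: distribution mu times matrix M."""
    out = {x: 0.0 for x in STATES}
    for x, p in mu.items():
        for y, q in M[x].items():
            out[y] += p * q
    return out


Z = sum(math.exp(BETA * potential(x)) for x in STATES)
GIBBS = {x: math.exp(BETA * potential(x)) / Z for x in STATES}


def after_cyclic_round(mu, order=(0, 1, 2)):
    """Apply the single-node updates in a fixed cyclic order."""
    for i in order:
        mu = push(mu, step_matrix(i))
    return mu


drift = max(abs(after_cyclic_round(GIBBS)[x] - GIBBS[x]) for x in STATES)
print(drift)
```

The drift is zero up to rounding error, for this and any other ordering of the nodes, in agreement with the claim that the product of the single-node matrices preserves the Gibbs distribution.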

(ii) Consider a vertex $v \in V$ and an r-close-knit set of size k containing v, denoted 
$S_v$. Consider $\Gamma_{v,\beta}$, the version of the process where each vertex in $S_v$ 
updates just as before, but each vertex in $V \setminus S_v$ always chooses state B 
when scheduled. 



This restricted dynamics on V still corresponds to a potential game, specified 
by potential function 

$\rho^*(x) = \sum_{(i,j) \in E} \rho(x_i, x_j)$, for $x \in \{A, B\}^{S_v} \times \{B\}^{V \setminus S_v}$, 

and with the Gibbs distribution $\mu_\beta(x) \propto e^{\beta \cdot \rho^*(x)}$ as its stationary distribution. 
Again, just as in [You93] (since the precise scheduling order does 
not play a role in this result) the condition that G is (r, k)-close-knit implies 
that the state $A_S$, defined as "all A" on $S_v$ and "all B" on $V \setminus S_v$, is the state 
with the highest potential among the possible states of the system. 
One additional complication of the dynamics $\Gamma_{v,\beta}$ is that it often schedules 
(unnecessarily) nodes outside $S_v$, that do not change. Consider $\Xi_{v,\beta}$, that is, 
the version of $\Gamma_{v,\beta}$ that "only schedules nodes in $S_v$" (i.e. it ignores moves 
of $\Gamma_{v,\beta}$ that schedule nodes outside of $S_v$). 

To describe this dynamics formally, view each distribution $\mathcal{D}_i$ as a set of 
symbols from the alphabet V. Then the set of trajectories of the dynamics 
$\Gamma_{v,\beta}$ can be specified by the words of the regular language $L_\Gamma = (\mathcal{D}_1 \cdot \mathcal{D}_2 \cdots \mathcal{D}_m)^*$. 
Trajectories of $\Xi_{v,\beta}$ correspond to words in another regular language 
$L_\Xi$, more precisely to the ones obtained by deleting symbols in $V \setminus S_v$ 
from words in $L_\Gamma$. This regular language can be specified by the regular 
expression $((\mathcal{D}_1 \cup \{\lambda\}) \cdots (\mathcal{D}_m \cup \{\lambda\}) \cap \Sigma^+)^*$. This expression yields a 
matrix of size $2^{|S_v|} \times 2^{|S_v|}$ for $\Xi_{v,\beta}$. 

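Claim. For every $\epsilon > 0$ there exists $\eta \in \mathbb{N}$ such that, for every $\eta' > \eta$, $\eta' \in \mathbb{N}$, 
any initial state of $\Gamma_{v,\beta}$ and every state $T \in \{A, B\}^{S_v}$, 

$|\Pr[\Gamma_{v,\beta} \text{ in state } T \mid |w| = \eta' \cdot m \cdot n] - \Pi(T)| < \epsilon$. 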

Proof. Let $\epsilon > 0$. As $\Xi_{v,\beta}$ converges to its stationary distribution, there 
exists k > 0 such that for all $k' > k$ and every initial state of $\Xi_{v,\beta}$ 

$|\Pr[\Xi_{v,\beta} \text{ in state } T \mid |w| = k'] - \Pi(T)| < \epsilon/2$, (1) 

where $\Pi$ is the stationary distribution of dynamics $\Xi$. Of course, states with 
positive support in $\Pi$ have the same probability in $\Pi$, that is 

$\forall T \in \{A, B\}^{S_v}: \Pi(T) = \Pi(v)$. 

Let Y be a random trajectory of length $\eta' \cdot m \cdot n$ in $L_\Gamma$ and let pr(Y) be its 
projection onto $L_\Xi$. 

Claim. There exists $\eta > 0$ such that for all $\eta' > \eta$ 

$\mathrm{Prob}_{|Y| = \eta' \cdot m \cdot n}[\,|pr(Y)| < k\,] < \frac{\epsilon}{2}$. 



Proof. The probability that any given distribution $\mathcal{D}_i$ whose support includes 
some element in $S_v$ will schedule (in a given round) a node in this set 
is $\Omega(1/n)$, by the fairness condition. There is at least one such $\mathcal{D}_i$ among all 
the m distributions. Therefore, the expected length of pr(Y) is $\Omega(k/n)$. A 
simple application of Markov's inequality gives the desired result. 

□ 

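Now write 

$\Pr[\Gamma_{v,\beta} \text{ in state } T \mid |w| = \eta' \cdot m \cdot n]$ 
$= \sum_j \Pr[\Gamma_{v,\beta} \text{ in state } T \mid |w| = \eta' \cdot m \cdot n, |pr(w)| = j] \cdot \mathrm{Prob}[|pr(w)| = j]$. 

Therefore we have 

$|\Pr[\Gamma_{v,\beta} \text{ in state } T \mid |w| = \eta' \cdot m \cdot n] - \Pi(T)|$ 
$\leq \sum_j |\Pr[\Gamma_{v,\beta} \text{ in state } T \mid |w| = \eta' \cdot m \cdot n, |pr(w)| = j] - \Pi(T)| \cdot \mathrm{Prob}[|pr(w)| = j]$. 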

The first term in the product is an absolute difference between two probability 
values, and thus has absolute value at most one. Therefore, by Claim ii, 
neglecting in the sum those terms with $j < k$ changes the sum by at 
most $\epsilon/2$. On the other hand 

$\Pr[\Gamma_{v,\beta} \text{ in state } T \mid |w| = \eta' \cdot m \cdot n, |pr(w)| = j]$ 
$= \Pr[\Xi_{v,\beta} \text{ in state } T \mid |w| = k']$ with $k' = j$. 

Now, using Equation (1), Claim ii follows. 

□ 

From now on the proof mirrors rather closely the one for the case of random 
scheduling (presented in the Appendix to [You93]): first, because the 
stationary distribution of process $\Gamma$ is the Gibbs distribution, there exists a 
finite value $\beta(\Gamma, S, p)$ such that $\mu_v(A_S) > 1 - p^2/2$ for all $\beta > \beta(\Gamma, S, p)$. 
There are only a finite number of nonisomorphic dynamical systems $\Gamma_{v,\beta}$ 
(where isomorphism of dynamical systems is meant to be the isomorphism of 
the underlying graph topologies $S_v$ and of the projection of schedulers onto 
$S_v$). In particular we can find $\beta(r, k, p)$ and $\eta(r, k, p)$ such that, for all graphs 
$G \in \mathcal{G}$ and all r-close-knit subsets S with k vertices, the following relation 
holds for all initial states: 

$\forall \beta > \beta(r, k, p), \forall \eta' > \eta, \Pr[y^{\eta' \cdot m \cdot n} = A_S] > 1 - p^2$. (2) 

where $y^t$ is the state of the dynamical system on set $S_v$ at time t. We can 
now derive that for every close-knit set S 

$\forall \beta > \beta(r, k, p), \forall \eta' > \eta, \Pr[x^{\eta' \cdot m \cdot n} = A_S] > 1 - p^2$. (3) 



where $x^t$ is now the state of the process from the theorem. The argument 
is obtained via essentially the same coupling as the one from [You98], hence 
it is omitted from this writeup. Since every vertex i is contained in an (r, k)-close-knit 
set, it follows that 

$\forall \beta > \beta(r, k, p), \forall \eta' > \eta, \Pr[x_i^{\eta' \cdot m \cdot n} = A] > 1 - p^2$. 

Therefore the expected number of vertices playing action A at time $\eta' \cdot m \cdot n$ 
is at least $(1 - p^2) n$. 
But this implies that 

$\forall t > \eta \cdot m \cdot n$, Pr[at least $(1 - p) n$ nodes have label A] $> 1 - p$. 

Indeed, if this were not the case, then with probability at least p more than pn 
nodes at time t would have label B, a contradiction. Now, by an application 
of Markov's inequality, the expected time until at least $(1 - p) n$ nodes are 
labelled A is at most $\eta \cdot m \cdot n / (1 - p)$. Since this holds for all graphs G in $\mathcal{G}$, 
the p-inertia of the process is bounded as stated in the theorem. 
(iii) Consider a scheduler working in rounds. In each round the scheduler is 
scheduling nodes according to a fixed permutation $\pi$, the same for all rounds. 
In each round the scheduler is scheduling each node at least once. For the 
first $\lceil rn \rceil + 1$ nodes the scheduler continues scheduling each of them (after 
the initial one) until the node switches to strategy B. The scheduler plays 
each remaining node exactly once. 

It is easy to see that there exists a constant $\epsilon > 0$ (that may depend on $\beta$) 
such that, at each stage, each agent switches to strategy B with probability 
greater than or equal to $\epsilon$. 

Therefore the probability that any given agent needs to be scheduled for 
more than $c \log(n)$ rounds before turning to B is $o(1/n)$ for large enough c. 
It follows that the given scheduler is $O(n \log(n))$-fair w.h.p. 

3.1 Main result: Diffusion of norms by contagion 

Adaptive schedulers can display two very different notions of adaptiveness: 

(i) The next node depends only on the set of previously scheduled nodes, or 

(ii) It crucially depends on the states of the system so far. 

The adaptive scheduler in Theorem 9 (iii) was crucially using the second, 
stronger, kind of adaptiveness. In the sequel we study a model that displays 
adaptiveness of type (i) but not of type (ii). The model is specified as follows: 
To each node v we associate a probability distribution $D_v$ on the vertices of 
G. We then choose the next scheduled node according to the following process. 
If $t_i$ is the node scheduled at stage i, we choose the next scheduled node 
by sampling from $D_{t_i}$. In other words, the scheduled node performs a (non-uniform) 
random walk on the vertices of graph G. To exclude technical problems 
such as the periodicity of this random walk, we assume that it is always the 
case that $v \in \mathrm{supp}(D_v)$. Also, let H be the directed graph with edges defined 
as follows: $(x, y) \in E[H] \iff y \in \mathrm{supp}(D_x)$. This dynamics generalizes 
both the class of non-adaptive schedulers from the previous result and the random 
scheduler (for the case when H is the complete graph). In the context of van 
Rooy's evolutionary analysis of signalling games in natural language [Roo04], it 
functions as a simplified model for an essential aspect of emergence of linguistic 
conventions: transmission via contagion. 
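
The contagion scheduler amounts to a random walk on the nodes. The sketch below illustrates it under assumptions made only for this example: the function names `contagion_schedule` and `is_weakly_reversible`, the representation of each distribution as a dictionary of probabilities, and the three-node path with self-loops. The second helper checks the symmetry-of-supports condition that the text introduces later as weak reversibility.

```python
import random


def contagion_schedule(D, start, steps, rng):
    """Sample a schedule in which each scheduled node picks its
    successor from its own distribution.

    D: dict node -> dict {next node: probability}; every node is
    assumed to be in the support of its own distribution.
    """
    current, schedule = start, []
    for _ in range(steps):
        nodes, probs = zip(*D[current].items())
        current = rng.choices(nodes, weights=probs, k=1)[0]
        schedule.append(current)
    return schedule


def is_weakly_reversible(D):
    """True if u can schedule v exactly when v can schedule u."""
    return all(u in D[v] and D[v][u] > 0
               for u in D for v, p in D[u].items() if p > 0)


# Hypothetical distributions on the path 0 - 1 - 2, with self-loops.
D_PATH = {
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.25, 1: 0.5, 2: 0.25},
    2: {1: 0.5, 2: 0.5},
}
walk = contagion_schedule(D_PATH, start=0, steps=50, rng=random.Random(1))
print(walk[:10], is_weakly_reversible(D_PATH))
```

Unlike the scheduler of Theorem 9 (iii), this process never looks at the strategies of the agents: the next node depends only on which node was scheduled last.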

It is easy to see that the dynamics can be described by an aperiodic Markov 
chain M on the set $V^{\{A,B\}} \times V$, where a state (w, x) is described as follows: 

— w is the set of strategies chosen by the agents. 

— x is the label of the last agent that was given the chance to update its state. 

If the directed graph H is strongly connected then the Markov chain M is 
irreducible, hence it has a stationary distribution $\Pi$. We will, therefore, limit 
ourselves in the sequel to settings with strongly connected H. We will, further, 
assume that the dynamics is weakly reversible, i.e. $x \in \mathrm{supp}(D_y)$ if and only if 
$y \in \mathrm{supp}(D_x)$. This, of course, means that the graph H is undirected. Note that 
since we do not constrain otherwise the transition probabilities of distributions 
$D_i$, the stationary distribution $\Pi$ of the Markov chain does not, in general, decompose 
as a product of component distributions. That is, one cannot generally 
write $\Pi(w, x)$ as $\Pi(w, x) = \pi(w) \cdot \rho(x)$, for some distributions $\pi$, $\rho$. 

Theorem 10. The set $Q = \{(w, x) \mid w = V^A\}$ is the set of stochastically stable 
states for the diffusion of norms by contagion. 

Proof. The states in Q are obviously reachable from one another by zero-resistance 
moves, so it is enough to consider one state $y \in Q$ and prove that it is stochastically 
stable. To do so, by Proposition 6, all we need to do is show that y is the 
root of a tree of minimal resistance. Indeed, consider another state $x \in Q$ and 
let T be a minimum potential tree rooted at x. 

Claim. There exists a tree $\overline{T}$ rooted at y having potential less than or equal to the 
potential of the tree T, strictly smaller in case x is not a state having all its 
first-component labels equal to A. 

Let $\pi_{y,x} = (x_0, i_0) \to (x_1, i_1) \to \cdots \to (x_k, i_k) \to (x_{k+1}, i_{k+1}) \to \cdots \to (x_r, i_r)$ 
be the path from y to x in T (that is, $(x_0, i_0) = y$, $(x_r, i_r) = x$). 

We will define $\overline{T}$ by viewing the set of edges of T as partitioned into subsets 
of edges corresponding to paths as follows (see Figure 2 (a)): 

(i) The set of edges of path $\pi_{y,x}$. 

(ii) The set of edges of the subtree rooted at y. 

(iii) Edges of tree components (perhaps consisting of a single node) rooted at a 
node of $\pi_{y,x}$, other than y (but possibly being x). 

To obtain $\overline{T}$ we will transform each tree (path) in the above decomposition 
of T into one that will be added to $\overline{T}$. The transformation goes as follows: 



Fig. 2. (a). Decomposition of edges of tree T (b). Resistance of edges on a path between 
two nodes X and Y. 



(i) Instead of path $\pi_{y,x}$ we add path $\pi_{x,y}$ from x to y defined by: 
$\pi_{x,y} = (x_r, i_r) \to (x_{r-1}, i_r) \to (x_{r-2}, i_{r-1}) \to \cdots \to (x_0, i_1) \to (x_0, i_0) = y$. 

(ii) Rooted trees of type (ii) are included into tree $\overline{T}$ as well. 

(iii) The transformation is more complicated for the third type of edges, and we 
explain it in detail. Let $W_k$ be a tree component of T, connected to path 
$\pi_{y,x}$ at connection point $(x_k, i_k)$. 

Case 1: $x_k = x_{k-1}$. Then the point $(x_k, i_k) = (x_{k-1}, i_k)$ belongs to path 
$\pi_{x,y}$ as well, so one can just add the rooted tree $W_k$ to $\overline{T}$ as well. 
Case 2: $x_k \neq x_{k-1}$ and the move $(x_{k-1}, i_{k-1}) \to (x_k, i_k)$ has positive resistance. 
In this case, since in configuration $x_{k-1}$ and scheduled node $i_k$ we 
have a choice of either moving to $x_k$ or staying in $x_{k-1}$, it follows that the 
move $(x_k, i_k) \to (x_{k-1}, i_k)$ has zero resistance. 

Therefore we can add to $\overline{T}$ the tree $\overline{W}_k = W_k \cup \{(x_k, i_k) \to (x_{k-1}, i_k)\}$. The 
tree has the same resistance as the one of tree $W_k$. 

Case 3: $x_{k-1} \neq x_k$ and the move $(x_{k-1}, i_{k-1}) \to (x_k, i_k)$ has zero resistance. 
Let j be the smallest integer such that either $x_{k+j+1} = x_{k+j}$ or $x_{k+j+1} \neq x_{k+j}$ 
and the move $(x_{k+j}, i_{k+j}) \to (x_{k+j+1}, i_{k+j+1})$ has positive resistance. 
In this case, one can first replace $W_k$ by $W_k \cup \{(x_k, i_k) \to (x_{k+1}, i_{k+1}), 
(x_{k+1}, i_{k+1}) \to \cdots \to (x_{k+j}, i_{k+j})\}$ without increasing its total resistance. 
Then we apply one of the techniques from Case 1 or Case 2. 
Case 4: $x_{k-1} \neq x_k$, the move $(x_{k-1}, i_{k-1}) \to (x_k, i_k)$ has zero resistance, 
and all moves on $\pi_{y,x}$ from $x_k$ up to x have zero resistance. Then define 
$\overline{W}_k = W_k \cup \{(x_k, i_k) \to (x_{k+1}, i_{k+1}) \to \cdots \to x\}$. 

It is easy to see that no two sets $\overline{W}_k$ intersect on an edge having positive 
resistance. The union of the paths of all the sets is a directed associated graph $\overline{W}$ 
rooted at y, that contains a rooted tree $\overline{T}$ of potential no larger than the potential 
of $\overline{W}$. Since transformations in cases (i), (iii) do not increase tree resistance, to 
compare the potentials of T and $\overline{W}$ it is enough to compare the resistances of 
paths $\pi_{y,x}$ and $\pi_{x,y}$. 

We come now to a fundamental property of the game G: since it is a potential
game, the resistance r(m) of a move $m = (a_1, j_1) \to (a_2, j_2)$ only depends
on the values of the potential function at three points: $a_1$, $a_2$ and $a_3$,
where $a_3$ is the state obtained by assigning node $j_2$ the value not assigned
by move m. Specifically, $r(m) > 0$ if either $\rho^*(a_2) < \rho^*(a_1)$, in
which case $r(m) = \rho^*(a_1) - \rho^*(a_2)$, or $a_2 = a_1$ and
$\rho^*(a_3) > \rho^*(a_1)$, in which case $r(m) = \rho^*(a_3) - \rho^*(a_1)$.
In other words, the resistance of a move is positive in the following two
cases: (1) The move leads to a decrease of the value of the potential function.
In this case the resistance is equal to the difference of potentials. (2) The
move corresponds to keeping the current state (thus not modifying the value of
the potential function), but the alternate move would have increased the
potential. In this case the resistance is equal to the value of this increase.
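
The two cases can be written as a small function. The following is a minimal
sketch in Python, not code from the paper: the name `move_resistance`, the
dictionary `TOY_POTENTIAL` and its two-node states are assumptions introduced
purely for illustration.

```python
from typing import Callable, Hashable

State = Hashable


def move_resistance(
    potential: Callable[[State], float],
    a1: State,
    a2: State,
    a3: State,
) -> float:
    """Resistance of a move from state a1 to state a2.

    a3 is the state that the scheduled node's other choice would have
    produced. Only the potentials of a1, a2 and a3 are consulted.
    """
    if a2 != a1:
        # Type (1): the state changes. The resistance is the drop in
        # potential, or zero when the potential does not decrease.
        return max(0.0, potential(a1) - potential(a2))
    # Type (2): the state is kept. The resistance is the increase in
    # potential that the alternate move would have achieved, if any.
    return max(0.0, potential(a3) - potential(a1))


# Hypothetical potential values on three states of a two-node toy example.
TOY_POTENTIAL = {"BB": 0.0, "AB": 1.0, "AA": 3.0}
toy_potential = TOY_POTENTIAL.__getitem__
```

With these toy values, moving from "AA" to "AB" costs 2, staying in "AB" when
"AA" was available also costs 2, and moving from "AB" up to "AA" is free.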

Let us now compare the resistances of paths $\pi_{y,x}$ and $\pi_{x,y}$. First,
the two paths contain no edges of infinite resistance, since they correspond to
possible moves under Markov chain dynamics $P^\epsilon$. If we discount second
components, the two paths correspond to a single sequence of states Z connecting
$x_0$ to $x_r$, more precisely to traversing Z in opposite directions. (The last
move in $\pi_{x,y}$ has zero resistance and can thus be discounted.)
Resistant moves of type (2) are taken
into account by both traversals, and contribute the same resistance value to both 
paths. So, to compare the resistances of the two paths it is enough to compare 
resistance of moves of type (1). Moves of type (1) of positive resistance are those 
that lead to a decrease in the potential function. Decreasing potential in one 
direction corresponds to increasing it in the other (therefore such moves have 
zero resistance in the opposite direction). 
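
The cancellation described in this paragraph can be checked numerically. The
sketch below is an illustration under stated assumptions, not the paper's
construction: the `Step` record, the helper names and the potential values
5, 3, 3, 6, 2 are all hypothetical. It sums the resistances of one sequence of
states traversed in both directions and confirms that the difference
telescopes to the potential at the start minus the potential at the end.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    rho_from: float          # potential of the state before the step
    rho_to: float            # potential of the state after the step
    stays: bool = False      # True for a type (2) step that keeps the state
    rho_alt: float = 0.0     # potential of the alternate state (stays only)


def step_resistance(step: Step) -> float:
    if step.stays:
        # Type (2): pay for the increase that was passed up.
        return max(0.0, step.rho_alt - step.rho_from)
    # Type (1): pay for any decrease in potential.
    return max(0.0, step.rho_from - step.rho_to)


def path_resistance(steps: List[Step]) -> float:
    return sum(step_resistance(s) for s in steps)


def reversed_path(steps: List[Step]) -> List[Step]:
    # Traverse the same sequence of states in the opposite direction.
    return [Step(s.rho_to, s.rho_from, s.stays, s.rho_alt)
            for s in reversed(steps)]


# Hypothetical potentials along Z: 5 -> 3 -> (stay at 3, alternate 4) -> 6 -> 2
Z = [
    Step(5.0, 3.0),
    Step(3.0, 3.0, stays=True, rho_alt=4.0),
    Step(3.0, 6.0),
    Step(6.0, 2.0),
]

forward = path_resistance(Z)                   # 2 + 1 + 0 + 4
backward = path_resistance(reversed_path(Z))   # 0 + 3 + 1 + 0
```

The stay step contributes the same amount in both directions, so only the
potential-changing steps affect the difference `forward - backward`.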

An illustration of the two types of moves is given in Figure 2 (b), where 
the path between X and Y goes through four other nodes, labeled 1 to 4. The 
relative height of each node corresponds to the value of the potential function at 
that node. Nodes 2 and 3 have equal potential, so the transition between 2 and 3 
contributes an equal amount to the resistance of paths in both directions (which 
may be positive or not). Other than that only transitions of positive resistance 
are pictured. 

The conclusion of this argument is that
$r(\pi_{y,x}) - r(\pi_{x,y}) = \rho^*(x) - \rho^*(y) \geq 0$, and
$r(\pi_{y,x}) - r(\pi_{x,y}) > 0$ unless x is an "all A" state.

3.2 The inertia of diffusion of norms with contagion 

Theorem 10 shows that random scheduling is not essential in ensuring that 
stochastically stable states in Peyton Young's model correspond to all players 
playing A: the same result holds in the model with contagion. On the other 
hand, the result on the p-inertia of the process on families of close-knit graphs is 
not robust to such an extension. Indeed, consider the line graph $L_{2n+1}$ on
2n + 1 nodes labelled $-n, \dots, -1, 0, 1, \dots, n$. Consider a random walk
model such that: (a) the origin of the random walk is node 0, and (b) the walk
goes left, goes right or stays in place, each with probability 1/3. It is a
well-known property of the random walk that it takes $\Omega(n^2)$ time to reach
nodes at distance $\Omega(n)$ from the origin. Therefore, the p-inertia of this
random walk dynamics is $\Omega(n^2)$ even though for every r > 0 there exists a
constant k such that the family $\{L_{2n+1}\}$ is (r, k)-close-knit for large
enough n.
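
The quadratic growth of the hitting time can be observed in a short
simulation. The following Python sketch is illustrative only: the function
names, the seed and the trial counts are assumptions, and the walk is stopped
as soon as it reaches either endpoint -n or +n. For this particular walk a
standard gambler's-ruin computation gives an expected stopping time of
3n^2/2, which the simulation should approximate.

```python
import random


def hitting_time(n: int, rng: random.Random) -> int:
    """Steps until a lazy walk started at 0 first reaches -n or +n."""
    position, steps = 0, 0
    while abs(position) < n:
        # Go left, stay in place, or go right, each with probability 1/3.
        position += rng.choice((-1, 0, 1))
        steps += 1
    return steps


def mean_hitting_time(n: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected hitting time of -n or +n."""
    rng = random.Random(seed)
    return sum(hitting_time(n, rng) for _ in range(trials)) / trials


def exact_mean_hitting_time(n: int) -> float:
    """Closed form 3 n^2 / 2 for the expected time to reach -n or +n."""
    return 1.5 * n * n
```

Doubling n multiplies the closed-form value by four, which is the quadratic
scaling used in the argument above.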



In the journal version of the paper we will present an upper bound on the 
p-inertia for the diffusion of norms with contagion based on concepts similar to 
the blanket time of a random walk [WZ96].

4 Conclusions and Acknowledgments 

Our results have made the original statement by Peyton Young more robust, and 
have highlighted the (lack of) importance of various properties of the random 
scheduler in the results from [You98]: the reversibility of the random scheduler, 
as well as its inability to use the global system state are important in an adver- 
sarial setting, while its fairness properties are not crucial for convergence, only 
influencing convergence time. Also, the fact that the stationary distribution of 
the perturbed process is the Gibbs distribution (true for the random scheduler) 
does not necessarily extend to the adversarial setting. 